Corpus: kat_newscrawl_2011

3.7.3 Distribution of the string similarity for different rank ranges

Distribution of the Levenshtein distance for words within each rank range:

String similarity for the top 1,000 words
Distance  Percentage of words
0         -
1         42.6724
2         57.3276

String similarity for the top 10,000 words
Distance  Percentage of words
0         -
1         33.7680
2         66.2320

String similarity for the top 100,000 words
Distance  Percentage of words
0         0.0004
1         24.7525
2         75.2471

String similarity for the top 1,000,000 words
Distance  Percentage of words
0         0.0004
1         24.7054
2         75.2942
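
The page does not state how these percentages were derived. One plausible reading, offered here only as an assumption, is that each word in a rank range is assigned the Levenshtein distance to its nearest other word in the same range, with everything at distance 2 or more grouped into the last bucket. The sketch below illustrates that reading for small word lists; the names `levenshtein`, `nearest_distance_distribution`, and the `cap` parameter are hypothetical and not taken from the corpus tooling.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion and substitution."""
    if len(a) < len(b):
        a, b = b, a  # keep the DP row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def nearest_distance_distribution(words, cap=2):
    """Percentage of words per nearest-neighbour distance.

    Assumption: distances are capped at `cap`, so the last bucket
    means "cap or more". The pairwise loop is quadratic, so this is
    only practical for small lists.
    """
    counts = Counter()
    for i, word in enumerate(words):
        best = cap
        for j, other in enumerate(words):
            if i == j:
                continue
            best = min(best, levenshtein(word, other))
            if best == 0:
                break  # cannot get closer than an exact duplicate
        counts[best] += 1
    total = len(words)
    return {d: 100.0 * n / total for d, n in sorted(counts.items())}


if __name__ == "__main__":
    sample = ["cat", "cot", "dog", "zebra"]
    for distance, percent in nearest_distance_distribution(sample).items():
        print(f"{distance} {percent:.4f}")
```

Under this reading, a distance of 0 can only occur when the word list contains exact duplicates, which would be consistent with the very small 0 bucket in the larger rank ranges. A full run over a million words would need an indexed neighbour search rather than this pairwise loop.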
Generated in 778 ms on 2017-10-22 at 21:03.